silu_and_mul fused moe #208
@Chi-Chu319 Can you please re-trigger the CI?
# Calculate new pid based on the new grouping
# Note that we need to consider the following two cases:
# 1. the current pid is on a tall xcd
# 2. the current pid is on a short xcd
Hi, can I get an example of how a block is mapped onto the 8-die chip?
It's the same as in https://github.com/ROCm/triton/blob/main_perf/python/perf-kernels/gemm.py#L110
Example of a kernel with 100 pids:
The pids are assigned to the XCDs in a round-robin fashion, so pid 0 goes to XCD 0, pid 1 goes to XCD 1, and so on, wrapping around so that pid 8 goes back to XCD 0.
In the end, XCDs 0, 1, 2, 3 get 13 pids each (the "tall" XCDs) and XCDs 4, 5, 6, 7 get 12 pids each (the "short" XCDs), since 100 = 8 * 12 + 4.
The remapping permutes the pid sequence so that
PID: [0, 1, 2, 3, 4, ..., 99]
      |  |  |  |  |       |
XCD: [0, 1, 2, 3, 4, ..., 3]
is mapped to
PID: [0, 13, 26, 39, 52, 64, 76, 88, 1, 14, 27, ..., 51]
      |  |   |   |   |   |   |   |   |  |   |        |
XCD: [0, 1,  2,  3,  4,  5,  6,  7,  0, 1,  2,  ..., 3]
So, for example, before the remapping XCD 0 gets pids [0, 8, 16, ...] and XCD 1 gets [1, 9, 17, ...]; after the remapping XCD 0 gets [0, 1, 2, ...] and XCD 1 gets [13, 14, 15, ...]. So each XCD only works with adjacent pids.
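For anyone who wants to check the numbers, here is a minimal plain-Python sketch of the remapping described above. It assumes the hardware dispatches pids to XCDs round-robin (`pid % num_xcds`) and follows the tall/short split from the code comment under review. The names `remap_xcd`, `pids_by_xcd`, `grid_size`, and `num_xcds` are illustrative only and are not claimed to match the kernel's actual signature.

```python
def remap_xcd(pid: int, grid_size: int, num_xcds: int = 8) -> int:
    """Map a hardware pid to a new pid so that each XCD sees a contiguous range.

    Assumption: the hardware places pid p on XCD (p % num_xcds).
    """
    # Size of a "tall" XCD; "short" XCDs hold one pid fewer.
    pids_per_xcd = (grid_size + num_xcds - 1) // num_xcds
    # Number of tall XCDs; if the grid divides evenly, every XCD is tall.
    tall_xcds = grid_size % num_xcds or num_xcds

    xcd = pid % num_xcds         # XCD this pid lands on
    local_pid = pid // num_xcds  # position of this pid within that XCD

    if xcd < tall_xcds:
        # Case 1: the current pid is on a tall XCD.
        return xcd * pids_per_xcd + local_pid
    # Case 2: the current pid is on a short XCD; skip all tall XCDs first.
    return (tall_xcds * pids_per_xcd
            + (xcd - tall_xcds) * (pids_per_xcd - 1)
            + local_pid)


def pids_by_xcd(grid_size: int, num_xcds: int = 8) -> dict:
    """Group the remapped pids by the XCD that executes them."""
    groups = {x: [] for x in range(num_xcds)}
    for pid in range(grid_size):
        groups[pid % num_xcds].append(remap_xcd(pid, grid_size, num_xcds))
    return groups


remapped = [remap_xcd(p, 100) for p in range(100)]
groups = pids_by_xcd(100)
print(remapped[:11])   # new pid for hardware pids 0..10
print(groups[0])       # pids handled by XCD 0 after remapping
print(groups[4])       # pids handled by XCD 4 (a short XCD)
```

With 100 pids, the first printed list reproduces the start of the remapped sequence in the diagram, and each XCD's group is a contiguous run of pids.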
Migrated from ROCm/triton#710